Virtual example for phonotactic language recognition

نویسندگان

  • Rong Tong
  • Bin Ma
  • Haizhou Li
چکیده

One challenge in spoken language recognition is the availability of training data. In this paper, we propose a virtual example construction method to derive artificial training examples from the existing training data. Using the proposed method, both target virtual examples and non-target virtual examples can be derived from the available training samples. An iterative virtual example selection method is proposed to select those virtual examples that may provide extra discriminative information for language separation. By incorporating virtual examples in language classifier training, the language recognition performances are improved for both closed-set and open-set tasks. Specifically, for LRE 2009 evaluation data of three durations: 30seconds, 10-seconds and 3-seconds, the language recognition performance improved by 3.67%, 11.98%, 6.42% respectively in closed-set conditions, and 10.14%, 10.55%, 5.75% respectively in open-set conditions.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Towards High Performance Phonotactic Feature for Spoken Language Recognition

With the demands of globalization, multilingual speech is increasingly common in conversational telephone speech, broadcast news and internet podcasts. Therefore, automatic spoken language recognition has become an important technology in multilingual speech related applications. For example, automatic spoken language recognition has been used as a preprocessing component for spoken language tr...

متن کامل

A Language Independent Approach To Acquiring Phonotactic Resources for Speech Recognition

Building and developing linguistic resources for languages is of prime importance with many areas of application. This paper focusses on a fully automatic approach to the aquisition of a syllable phonotactics for a particular language. In this approach the phonotactic constraints for a language are encoded in a finite-state phonotactic automaton the structure of which can be automatically deriv...

متن کامل

Fusing language information from diverse data sources for phonotactic language recognition

The baseline approach in building phonotactic language recognition systems is to characterize each language by a single phonotactic model generated from all the available languagespecific training data. When several data sources are available for a given target language, system performance can be improved using language source-dependent phonotactic models. In this case, the common practice is t...

متن کامل

Parallel Acoustic Model Adaptation for Improving Phonotactic Language Recognition

In phonotactic language recognition systems, the use of acoustic model adaptation prior to phone lattice decoding has been proposed to deal with the mismatch between training and test conditions. In this paper, a novel approach using diversified phonotactic features from parallel acoustic model adaptation is proposed. Specifically, the parallel model adaptation involves independent mean-only an...

متن کامل

Selecting phonotactic features for language recognition

This paper studies feature selection in phonotactic language recognition. The phonotactic feature is presented by n-gram statistics derived from one or more phone recognizers in the form of high dimensional feature vectors. Two feature selection strategies are proposed to select the n-gram statistics for reducing the dimension of feature vectors, so that higher order n-gram features can be adop...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014